Efficient Construction of Comprehensible Hierarchical Clusterings

Authors

  • Luis Talavera
  • Javier Béjar
Abstract

Clustering is an important data mining task that helps in finding useful patterns to summarize the data. In the KDD context, data mining is often used for description rather than for prediction. However, it turns out to be difficult to find clustering systems, in either the statistics or the Machine Learning field, that ease the interpretation task for the user. In this paper we present Isaac, a hierarchical clustering system which combines traditional clustering ideas with a feature selection mechanism and heuristics in order to provide comprehensible results. At the same time, a preprocessing step allows it to deal efficiently with large datasets. Results suggest that these aims are achieved and encourage further research.

Similar resources

Integrating Declarative Knowledge in Hierarchical Clustering Tasks

The capability of making use of existing prior knowledge is an important challenge for Knowledge Discovery tasks. As an unsupervised learning task, clustering appears to be one of the tasks that might benefit most from prior knowledge. In this paper, we propose a method for providing declarative prior knowledge to a hierarchical clustering system, stressing the interactive component. Pre...

Learning Multiple Hierarchical Relational Clusterings

Three important generalizations of the basic clustering problem are relational, hierarchical, and multiple clustering. This paper proposes the first approach to clustering that unifies all three. We describe a general probabilistic model for relational clustering, and show that flat, hierarchical and multiple relational clustering models are special cases. This paper also describes an efficient...

Optimization and Simplification of Hierarchical Clusterings

Clustering is often used to discover structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. In general, a search strategy cannot both (1) consistently construct clusterings of high quality and (2) be computationally inexpensive. However, we can partition the search so that a sys...

Exploring Efficient Attribute Prediction in Hierarchical Clustering

This work explores the feasibility of constructing hierarchical clusterings minimizing the expected cost of exploiting these clusterings for a prediction task. Particularly, we focus on gaining efficiency by means of reducing the number of features used to describe each node in the hierarchy. To explore a number of different hierarchical clusterings we use the Isaac clustering system, which can se...

MultiDendrograms: Variable-Group Agglomerative Hierarchical Clusterings

MultiDendrograms is an application written in Java that computes agglomerative hierarchical clusterings of data. Starting from a distance (or weight) matrix, MultiDendrograms is able to calculate its dendrograms using the most common agglomerative hierarchical clustering methods. The application implements a variable-group algorithm that solves the non-uniqueness problem found in the standard pai...

Publication date: 1998